Efficient bi-allelic tagging in human induced pluripotent stem cells using CRISPR

Summary
Allelic tagging of endogenous genes enables studying gene function and transcriptional control in the native genomic context. Here, we present an efficient protocol for bi-allelic tagging of protein-coding genes with fluorescent reporters in human iPSCs using CRISPR-Cas9-mediated homology-directed repair. We detail steps for design, cloning, electroporation, and single-cell clone isolation and validation. The tagging strategy described in this protocol is readily applicable to knock-in of other reporters in diverse cell types for biomedical research.

Note: The homology arms are approximately 500-1,000 bp upstream and downstream of the knock-in site. See Figure 1C for a schematic of donor vectors used for SIN3A as an example. To pick the genomic region as homology arm, we designed two pairs of primers 500-1,000 bp upstream of SIN3A stop codon and 500-1,000 bp downstream of SIN3A stop codon, respectively, by using Primer-BLAST (https://www.ncbi.nlm.nih.gov/tools/primer-blast/). The primers can also be designed by using other programs. The regions between the forward and reverse primers are used as homology arms. We chose a 501 bp sequence upstream of SIN3A stop codon as the left homology arm (SIN3A left in Figures 2A and 2B), and a 612 bp sequence downstream of SIN3A stop codon as the right homology arm (SIN3A right in Figures 2A and 2B).
b. Design the DNA oligos for Gibson assembly mediated donor vector cloning.
Note: Each oligo should contain a partial sequence from the donor vector and a partial sequence from the homology arm of the target gene. Figure 2 and Table 1 illustrate the oligo design for the donor vector used for SIN3A bi-allelic tagging. To prevent cutting of the donor vectors by the CRISPR-Cas9 system, we introduced point mutations in the sgRNA spacer or PAM sequence in the DNA oligos (Figures 2C-2E).
Figure 1. Schematic of the sgRNA design for SIN3A bi-allelic tagging (A) Two sgRNAs were designed for SIN3A bi-allelic tagging. sgRNA-1 is located upstream of the SIN3A stop codon, and sgRNA-2 is located downstream of the SIN3A stop codon. (B) The oligos used for in vitro transcription of sgRNA-1 and sgRNA-2. The T7 promoter in the forward primers is labeled sky blue, and the partial sgRNA scaffold sequence in the reverse primers is labeled purple. (C) Schematic of the EGFP and mCherry donor design for SIN3A bi-allelic tagging. The sequences upstream and downstream of the SIN3A stop codon were selected as the left and right homology arms for C-terminal bi-allelic tagging.
See Table 1 for examples of the Sanger sequencing primers used for SIN3A donor vectors.
Note: For bi-allelic tagging, the two donor vectors carry different tags but the same homology arms. For example, we designed two donor vectors with the same homology arms, one containing EGFP and the other containing mCherry, for SIN3A bi-allelic tagging. Several studies have demonstrated that incorporating a blocking mutation in the sgRNA PAM or spacer sequence of the donor template can increase knock-in efficiency. 5-7 Consider using synonymous mutations when introducing mutations at sgRNA spacer or PAM sequences in the homology arms to prevent editing of the donor vector. The knock-in efficiency positively correlates with the length of the homology arms. 8,9 Larger vectors lead to decreased transfection efficiency and cell viability.
(Table 2). Incubate the plate with Matrigel in a 37 °C / 5% CO2 incubator for at least 30 min before use.
e. Thaw the vial in a 37 °C water bath until only a small ice pellet can be seen. Quickly transfer the cell suspension to the prepared 15 mL conical tube using a P1000 pipette and mix 2-3 times.
f. Centrifuge the conical tube for 5 min in a swinging-bucket centrifuge at 200 × g and 20 °C-25 °C.
g. Carefully aspirate the supernatant and resuspend the cell pellet in an appropriate volume (see Table 2 for volume) of mTeSR1 medium with Rock inhibitor (final concentration 10 µM).
h. Remove the Matrigel from the coated well and add the cell suspension to the well.
i. Place in a 37 °C / 5% CO2 incubator, gently shaking the plate left/right and up/down to ensure even seeding.
j. Replace media daily with mTeSR1 (without Rock inhibitor). Once cells reach 80%-90% confluency, they need to be passaged.
6. Passaging iPSCs.
a. Prepare a Matrigel-coated plate as described in steps 4e-4f.
b. Label the plate with cell line, passage number, date, and operator initials.
c. Remove the 80%-90% confluent plate from the incubator and aspirate the spent media. Wash the plate with DPBS, using a volume equal to the volume of culture medium.
d. Aspirate the DPBS and add Accutase (see Table 2 for volume).
e. Incubate the plate at 37 °C for 5 min. Check cells for detachment by gently shaking the plate. If < 90% detachment is observed, incubate the plate for an additional 1-2 min, and check again until 90% detachment is observed.
f. Add DMEM/F12 (2× the volume of Accutase) and pipette the cell suspension up and down to mix. Transfer the cell suspension to an appropriately sized conical tube. Wash the plate with DMEM/F12 again (2× the volume of Accutase) and transfer into the same conical tube.
g. Centrifuge at 200 × g for 5 min at 20 °C-25 °C.
h. Aspirate the supernatant and resuspend the cell pellet in an appropriate volume of mTeSR1 with Rock inhibitor (see Table 2 for volume).
i. Aspirate the Matrigel from the coated well and seed cells into the well at a splitting ratio of 1:6 to 1:10.
j. Place in a 37 °C / 5% CO2 incubator, gently shaking the plate left/right and up/down to ensure even seeding.
k. Replace media daily with mTeSR1 (without Rock inhibitor) until cells reach 80%-90% confluence.
7. Freezing iPSCs.
a. Prepare cryovials by labeling with cell line, passage number, number of cells per vial, date, and operator initials.
b. Collect the cells from the plate as a cell pellet using the same procedure as in the passaging step.
c. Aspirate the supernatant and resuspend the cell pellet in an appropriate volume of freezing medium.
d. Dispense the cell suspension into pre-labeled cryovials and place the vials in a freezing container.

MATERIALS AND EQUIPMENT
Note: Store the resuspended DNA oligos at −20 °C for up to years.
Note: Store the Matrigel solution at 4 °C for up to 1 week. The amount of Matrigel is calculated based on the protein concentration of Matrigel stock.
Note: Store the medium at 4 °C for up to 2 weeks and −20 °C for up to 6 months.

Note: The FACS buffer should be filtered with a 0.22 µm filtration system and stored at 4 °C for up to 2 months.

STEP-BY-STEP METHOD DETAILS

Cloning donor vectors
Timing: 1-2 weeks (for steps 1 to 17)
The purpose of this step is to clone the donor vectors for tagging the target gene. The left and right homology arms are inserted into the donor using Gibson assembly.
Note: The isolated genomic DNA will be used as template for amplification of homology arms. We isolated genomic DNA from WTC11-i3N iPSCs,11 which is the parental line we used for SIN3A bi-allelic tagging.
3. Perform amplification using the cycling parameters below.
Note: The temperature used for amplification depends on DNA polymerase and DNA oligos.
4. Digest the backbone (the EGFP and mCherry empty donor vectors without homology arms) using the reaction below for left homology arm insertion (Figure 3A). Incubate the digestion reaction at 37 °C for 1 h.
Note: The backbones we used are two premade pUC57 based empty donor vectors in our lab, which contain either EGFP or mCherry fluorescent protein without homology arms. Other basic cloning vectors can also be used as backbone for donor vector cloning, for example pUC19. The reporters used for tagging can be inserted into the basic cloning vector before homology arms cloning, or together with homology arms using multiple fragment assembly.
5. Purify the digested backbone and PCR amplified left and right homology arms using gel extraction and Wizard SV Gel and PCR Clean-Up System. Follow the manufacturer's protocol and elute in 50 µL elution buffer (https://www.promega.com/-/media/files/resources/protcards/wizard-sv-gel-and-pcr-clean-up-system-quick-protocol.pdf?rev=29aa4610719c4f96afd213c345bc76d3&sc_lang=en). Check the concentration of each product via NanoDrop.
6. Set up one Gibson assembly reaction as below for the EGFP donor, and one for the mCherry donor separately. Incubate the reaction at 50 °C for 1 h to insert the left homology arm into the digested EGFP and mCherry empty donor vectors.
Steps | Temperature | Time | Cycles
Initial Denaturation | 98 °C | 1 min | 1
Note: We used Sanger-Fs1 and SIN3A-L-R for left homology arm confirmation, and SIN3A-R-Neo-F, SIN3A-R-Blast-F and Sanger-Rs1 for right homology arm confirmation (see Table 1 for oligo sequences).
9. Perform amplification using the cycling parameters below. Run the PCR product on an agarose gel and select the colonies with successful insertion based on the size of the PCR product.
(Figure 3B). Incubate the reaction at 37 °C for 1 h.
15. Purify the digested vectors from step 14 using gel extraction and Wizard SV Gel and PCR Clean-Up System. Follow the manufacturer's protocol and elute in 50 µL elution buffer (https://www.promega.com/-/media/files/resources/protcards/wizard-sv-gel-and-pcr-clean-up-system-quick-protocol.pdf?rev=29aa4610719c4f96afd213c345bc76d3&sc_lang=en). Check the concentration of each product with NanoDrop.
16. Repeat steps 6-11 to insert the right homology arm into the donor vectors, and check the sequence of the right homology arm using the Sanger sequencing primer Sanger-Rs1 in Table 1.
17. Prepare midi-preps of the EGFP and mCherry donor vectors with both left and right homology arms using the QIAGEN Plasmid Plus Midi Kit following the manufacturer's protocol (https://www.qiagen.com/us/Resources/ResourceDetail?id=3da21fc3-a078-4665-aefe-06154db2b6d2&lang=en).
Note: The donor plasmids will be added together with the Cas9/sgRNA ribonucleoprotein complex to the nucleofection reaction in a later step. Their volume should not exceed 10% of the total reaction volume. We usually elute the donor vectors from the midi-prep with 100 µL elution buffer to make the final concentration of each donor vector around 2,000 ng/µL. The quality of the donor plasmid is important for successful tagging. We checked the integrity of the donor vectors before use by agarose gel electrophoresis.
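The 10% volume limit in the note above can be checked with simple arithmetic before setting up the reaction. The following Python sketch is illustrative only. It assumes a 100 µL nucleofection reaction and plasmid concentrations expressed in ng/µL; the function names, the donor mass, and the RNP volume in the usage lines are hypothetical placeholders, not values taken from this protocol.

```python
def donor_volume_ul(mass_ng: float, conc_ng_per_ul: float) -> float:
    """Volume (µL) needed to deliver mass_ng of plasmid at the given concentration."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return mass_ng / conc_ng_per_ul


def fits_volume_budget(volumes_ul, reaction_volume_ul=100.0, max_fraction=0.10):
    """Return True if the added reagents stay within max_fraction of the reaction."""
    return sum(volumes_ul) <= max_fraction * reaction_volume_ul


if __name__ == "__main__":
    # Hypothetical inputs: 3,000 ng of each donor at ~2,000 ng/µL, plus 5 µL of RNP.
    egfp_ul = donor_volume_ul(3000.0, 2000.0)
    mcherry_ul = donor_volume_ul(3000.0, 2000.0)
    rnp_ul = 5.0
    print(egfp_ul, mcherry_ul, fits_volume_budget([egfp_ul, mcherry_ul, rnp_ul]))
```

A concentrated midi-prep keeps the donor volume small, which is why eluting in a small volume helps stay under the limit.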

Synthesis of sgRNA
Timing: 1 day (for steps 18 to 20)
The purpose of this step is to synthesize sgRNA using the precision sgRNA synthesis kit, which is a complete in vitro transcription system for rapid synthesis and purification of sgRNA ready for nucleofection.
18. PCR assemble the sgRNA DNA template.
a. Set up the PCR assembly reaction as below to amplify the sgRNA DNA template. The DNA oligos are listed in Figure 1B.
a. Set up the following in vitro transcription (IVT) reaction, adding the reaction components in the order given.
b. Mix the reaction components thoroughly, centrifuge briefly to collect all drops, and incubate at 37 °C for 2 h.
Note: You can set up multiple transcription reactions or extend the incubation up to 4 h for higher sgRNA yields. In our hands, each transcription reaction produces about 40 µg sgRNA after 2 h of incubation.
c. Add 1 µL of DNase I to the reaction mix after the transcription reaction, mix by pipetting, and incubate at 37 °C for 15 min to remove the sgRNA DNA template.
d. Dilute 0.5 µL of the IVT product in 10 µL of nuclease-free water, and mix with RNA loading dye.
e. Heat the sample at 70 °C for 10 min to denature the sgRNA and chill on ice, then run on a 2% agarose gel against an RNA ladder to check the integrity of the synthesized sgRNA. The expected sgRNA transcript size is 100 bases.
g. Centrifuge the empty purification column for an additional 60 s at 14,000 × g to completely remove any residual Wash Buffer and transfer the purification column to a clean 1.5 mL collection tube.
h. Add 15 µL of nuclease-free water to the center of the purification column filter, and centrifuge for 60 s at 14,000 × g to elute the sgRNA.
i. Check the concentration of the sgRNA with NanoDrop or Qubit. Make 4 µg aliquots of each sgRNA in PCR tubes and freeze them at −80 °C. sgRNA is generally stable at −80 °C for more than one year without degradation.
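The aliquoting in the last step is a direct conversion from the measured sgRNA concentration to a pipetting volume. Below is a minimal sketch of that arithmetic, assuming the concentration is read in ng/µL and aliquots of 4 µg are wanted. The function names and the example concentration are hypothetical, and the recovered eluate volume is passed in as a parameter because it is usually somewhat less than the volume loaded on the column.

```python
def aliquot_volume_ul(conc_ng_per_ul: float, aliquot_mass_ug: float = 4.0) -> float:
    """Volume (µL) that contains aliquot_mass_ug of sgRNA."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return aliquot_mass_ug * 1000.0 / conc_ng_per_ul  # convert µg to ng


def full_aliquots(total_volume_ul: float, conc_ng_per_ul: float,
                  aliquot_mass_ug: float = 4.0) -> int:
    """Number of complete aliquots obtainable from the recovered eluate."""
    per_aliquot = aliquot_volume_ul(conc_ng_per_ul, aliquot_mass_ug)
    return int(total_volume_ul // per_aliquot)


if __name__ == "__main__":
    # Hypothetical reading: 2,000 ng/µL in 15 µL of eluate.
    print(aliquot_volume_ul(2000.0), full_aliquots(15.0, 2000.0))
```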

Deliver CRISPR-Cas9 genome editing reagents into cells using nucleofection
Timing: 3-4 days (for steps 21 to 36)
The purpose of this step is to deliver the in vitro assembled Cas9/sgRNA ribonucleoprotein complex and donor vectors into the cells using nucleofection. We used the Lonza Nucleofector 2b Device and Human Stem Cell Nucleofector Kit 1 for nucleofection of iPSCs.

21. Prepare the iPSCs for nucleofection. Grow 1 million iPSCs for 1 nucleofection.
Note: High-quality iPSCs with normal morphology and growth rate are very important for successful tagging. The iPSCs are ready for nucleofection when they reach 80%-90% confluency. Meanwhile, maintain a culture of iPSCs without nucleofection, which will be used as the control in the cell sorting step.

22. Prepare the Matrigel-coated wells for seeding the nucleofected cells.
Note: We usually prepare 2 wells of a 6-well plate for 1 nucleofection.
Note: Make sure the total volume of tube 1 and tube 2 does not exceed 10 µL.
24. Harvest the cells by following steps 6c-6f in the "before you begin" section.
a. Count and aliquot 1 million cells into one 15 mL conical tube.
b. Centrifuge the 1 million cells at 200 × g for 5 min at 20 °C-25 °C.
25. During centrifugation, prepare 4 mL mTeSR1 medium with Rock inhibitor for 1 nucleofection.
26. Remove the Matrigel from the 6-well plate and add 1 mL of the prepared medium into each well.
27. Transfer the conical tube with the cell pellet back to the cell culture hood when centrifugation is done and proceed to the next step.
28. Combine 82 µL nucleofector solution with 18 µL supplement from the Lonza Human Stem Cell Nucleofector Kit 1, and then combine with the Cas9/sgRNA complex and donor plasmids.
29. Remove the supernatant completely from the cell pellet. Resuspend the cell pellet carefully with the mixed nucleofection reagents from step 28.
30. Transfer the cell suspension into the nucleofector cuvette using a P200 pipette and avoid creating any bubbles.
Note: Any bubbles in the cuvette will ruin the nucleofection reaction or decrease the transfection efficiency.
31. Gently tap the nucleofector cuvette to make sure the sample covers the bottom of the cuvette.
32. Transfer the cuvette with a closed lid to the Lonza nucleofector and start the nucleofection process.
Note: We used Lonza nucleofector 2b device with program A-023.
33. Take out the cuvette immediately when the nucleofection process is finished.
34. Resuspend the cells in the cuvette with the prepared mTeSR1 medium.
a. Add 1 mL of the prepared mTeSR1 medium from step 25 into the cuvette.
b. Transfer the cell suspension to the prepared Matrigel-coated wells from step 26 using the provided single-use pipette by adding 0.5 mL into each well.
c. Wash the cuvette with 1 mL medium again and transfer 0.5 mL to each well.
Note: In this step, we seed the 1 million cells after nucleofection into 2 wells of a 6-well plate with 0.5 million per well.
35. Gently shake the plate left/right and forward/backward to ensure even seeding. Incubate the cells in a 37 °C / 5% CO2 incubator.
36. Starting 24 h after nucleofection, change the culture medium daily with mTeSR1 medium without Rock inhibitor and grow the nucleofected cells for 3-4 days until they reach 80%-90% confluency.
Note: The Lonza 4D-Nucleofector system can also be used for human iPSC nucleofection. We tested the Lonza 4D-Nucleofector with the P3 Primary Cell 4D-Nucleofector X Kit L on WTC11 iPSCs and achieved successful nucleofection.

Generating tagged clonal cell lines with single-cell sorting
Timing: 4 h (for steps 37 to 41)
The purpose of this step is to sort bi-allelically tagged single cells into 96-well plates by fluorescence-activated cell sorting (FACS) and establish bi-allelically tagged iPSC clones.
37. Prepare a Matrigel-coated 96-well plate.
38. Remove the Matrigel when the coating is done and add 150 µL of mTeSR1 medium (with Rock inhibitor and 1% Penicillin-Streptomycin) into each well of the 96-well plate.
Note: We usually prepare 2 plates for one target gene.

39. Detach the nucleofected cells and the control cells without nucleofection with Accutase. Resuspend the cells in FACS buffer to 2 × 10^6 cells/mL.
40. Sort the bi-allelically tagged single cells into the prepared 96-well plate using a BD FACSAria II cell sorter. Please ensure the cell sorter has lasers to detect the fluorescent proteins used for tagging, for example, a 488 nm laser for detecting EGFP and a 561 nm laser for detecting mCherry (Figure 4).
a. Use the control cells and set the gate to separate the cells from debris based on the forward scatter area (FSC-A) and side scatter area (SSC-A).
b. Use the control cells and set the gate for single cells using forward scatter (FSC) and side scatter (SSC).
c. Set the baseline for EGFP and mCherry signal using control cells.
d. Switch to the nucleofected cells and set the gate for EGFP and mCherry double positive cells.
e. Sort the EGFP and mCherry double positive cells from the nucleofected cells into the 96-well plate at 1 cell/well using single cell mode.
Note: For successful bi-allelic tagging, you should see four cell populations in the FACS plot: two single positive populations, one double positive population, and one double negative population (Figure 4B). The bi-allelic tagging efficiency may vary among different targets. The percentage of the double positive population for SIN3A bi-allelic tagging is about 0.15%.
Note: Plate alignment is a crucial step for plate-based single cell sorting. We noticed that plate sizes vary among manufacturers. To ensure cells are collected at the center of each well, please check the plate position by releasing test droplets into the wells.
Note: Refresh half of the media with 1% Penicillin-Streptomycin every 3 days for the first week using a 200 µL multichannel pipette without perturbing the cells. The Penicillin-Streptomycin will inhibit any contamination from sorting. Clones become visible after 1 week. Check the clones with a microscope and mark the wells that contain only one colony. After one week, refresh half or all of the media every 2 days for another week.

Validating tagged cell lines by genotyping PCR
Timing: 1 day (for steps 42 to 52)
The purpose of this step is to verify the bi-allelically tagged clones with genotyping PCR.
42. Design the genotyping primers to confirm the successful bi-allelic tagging.
Note: We designed 4 pairs of genotyping primers for SIN3A bi-allelic tagging (Figure 5A, Table 3). For each pair, one primer should target the genomic sequence outside of the homology arm, and the other primer should target the inserted sequence, for example, the EGFP, mCherry, neomycin, and blasticidin sequences for SIN3A bi-allelic tagging.
43. Start the experimental validation of the clones when they reach about 80% confluency (usually takes 2 weeks). Check the labeled clones in step 41 with a microscope and select the clones with normal morphology for genotyping.
Note: We usually pick 24 clones at one time.
44. Prepare Matrigel-coated 24-well plates and add 0.5 mL mTeSR1 medium with Rock inhibitor into each well.
49. Perform amplification using the cycling parameters below.
50. Check the PCR product with agarose gel electrophoresis.
Note: We checked the SIN3A bi-allelic tagging clone with two types of PCR, amplifying either the left and right parts of the insertion (Figure 5B) or the whole insertion region (Figure 5C). Amplification of the left and right parts of the insertion confirms that the insertion happened at the correct genomic location. Amplification of the whole insertion region confirms that the whole insertion happened in a bi-allelic manner.
51. Check the DNA sequence of the PCR product with Sanger sequencing (Figure 6).
52. Expand the correct clones based on the genotyping PCR and Sanger sequencing, and make cryogenic stocks of them.

Functional characterization of the tagged cell lines
Timing: 2 weeks (for steps 53 to 55)
The purpose of this step is to characterize the bi-allelically tagged clones using different functional assays. Here, we checked the expression of fluorescent reporters in iPSCs, iPSC-derived glutamatergic neurons, and iPSCs at different passages to confirm the genetic stability of the reporters. We also checked the expression of pluripotency markers in the tagged clones to verify that the reporters do not change the pluripotency of iPSCs.
Note: For the SIN3A bi-allelically tagged clone, we saw both EGFP and mCherry signals, which indicates successful expression of the fluorescent reporters from the SIN3A locus (Figures 7A and 7B).
54. Check the expression of pluripotency markers in the bi-allelically tagged clones.
a. Extract the total RNA from the bi-allelically tagged clone using the QIAGEN RNeasy Plus Mini Kit following the manufacturer's protocol (https://www.qiagen.com/us/Resources/ResourceDetail?id=1d882bbe-c71d-4fec-bdd2-bc855d3a4b55&lang=en).
b. Perform a reverse-transcription reaction to make cDNA from the extracted total RNA using the iScript cDNA synthesis kit following the manufacturer's protocol (https://www.bio-rad.com/webroot/web/pdf/lsr/literature/4106228.pdf).
c. Run qPCR to check the expression of the pluripotency markers in both the control cells and the tagged clone.
Note: We checked the expression of NANOG, POU5F1 and SOX2 in the parental cell line and the bi-allelically tagged clone and did not observe a significant difference, which indicates that the insertion of fluorescent reporters does not influence the pluripotency of the tagged clone (Figure 7C).
Note: For the glutamatergic neurons differentiated from the SIN3A bi-allelically tagged clone, we saw both EGFP and mCherry signals, which indicates that the fluorescent reporters are maintained during neuronal differentiation (Figures 7D and 7E).
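The qPCR comparison of pluripotency markers between the parental line and the tagged clone can be summarized as a fold change. The protocol does not state which quantification method was used, so the sketch below is only one common option: the comparative Ct (2^-ΔΔCt) calculation, assuming roughly 100% amplification efficiency and normalization to a reference gene. The reference gene and all Ct values shown are hypothetical.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a target gene in the sample relative to the control (2^-ddCt)."""
    dct_sample = ct_target_sample - ct_ref_sample      # normalize tagged clone
    dct_control = ct_target_control - ct_ref_control   # normalize parental line
    ddct = dct_sample - dct_control
    return 2.0 ** (-ddct)


if __name__ == "__main__":
    # Hypothetical Ct values for one marker (e.g., NANOG) and one reference gene.
    fold = relative_expression(ct_target_sample=22.0, ct_ref_sample=18.0,
                               ct_target_control=22.0, ct_ref_control=18.0)
    print(f"fold change vs. parental line: {fold:.2f}")
```

A fold change close to 1 for NANOG, POU5F1 and SOX2 would be consistent with unchanged pluripotency marker expression; replicate measurements and a statistical test are still needed to judge significance.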

EXPECTED OUTCOMES
Tagging endogenous genes is a powerful technology for monitoring gene expression and characterizing protein function. Bi-allelic tagging enables real-time monitoring of gene expression and protein subcellular localization in an allele-specific manner, characterization of cis vs. trans elements that affect gene expression, and pooled CRISPR screening to identify genetic components essential for phenotypes associated with target gene expression. The expected outcome of this protocol is a set of bi-allelically tagged clones for the gene of interest. Tagging the target gene with one fluorescent protein results in both homozygous and heterozygous knock-in. Cells with heterozygous knock-in could carry undesired indels on the non-tagged allele. Tagging both alleles can minimize such unwanted editing events. Furthermore, this protocol can be applied to simultaneously tag two genes.
We usually obtain about 0.1% double positive cells in human iPSCs 3-4 days after transfection without any selection. The fraction of cells positive for only one fluorescent protein is about 6- to 16-fold higher. For any batch of transfection, we only need 100,000 cells to sort a full 96-well plate of single cells. From a 96-well plate, we usually get about 30 surviving clones, and 80% of the established clones have successful knock-in of both EGFP and mCherry fluorescent reporters, making the entire tagging strategy highly efficient.
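The figures quoted above can be combined into a rough yield estimate when planning how many cells and plates to prepare. The sketch below simply multiplies the rates reported in this section (about 0.1% double positive cells, about 30 surviving clones per plate, about 80% validated). The function name and the dictionary keys are illustrative, and actual yields will vary with the target gene.

```python
def sorting_yield(cells_sorted: int, double_positive_rate: float,
                  surviving_clones: int, validated_fraction: float) -> dict:
    """Estimate double positive events and validated clones from reported rates."""
    return {
        # Cells expected to fall in the EGFP+/mCherry+ gate.
        "double_positive_cells": cells_sorted * double_positive_rate,
        # Surviving clones expected to carry both reporters.
        "validated_clones": surviving_clones * validated_fraction,
    }


if __name__ == "__main__":
    estimate = sorting_yield(cells_sorted=100_000, double_positive_rate=0.001,
                             surviving_clones=30, validated_fraction=0.8)
    print(estimate)
```

With these inputs, the estimate is roughly 100 double positive cells, enough to fill one 96-well plate, and roughly 24 validated clones per plate.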

LIMITATIONS
Gene expression levels vary among genes and cell types. We select the tagged cells based on the expression of fluorescent proteins driven by the promoter of the gene of interest. This strategy is not suitable for silent genes. To solve this problem, an additional promoter flanked by loxP sites can be added upstream of the fluorescent proteins. The additional promoter can be removed by Cre-mediated excision in the final clones. Additional optimization, including using efficient sgRNAs, increasing the transfection efficiency, and applying antibiotic selection, will further increase the success rate.

Potential solution
Double check the nucleofection program is optimized for your cell line. Double check the quality of donor vectors by gel electrophoresis and NanoDrop. Purify the donor vectors again if the quality is not good for nucleofection.

Problem 2
The bi-allelic tagging efficiency is low (section ''generating tagged clonal cell lines with single cell sorting'', step 40).

Potential solution
Grow the nucleofection cells longer to get more cells for sorting. Treat the nucleofection cells with antibiotics when antibiotic resistance genes are added in the donor vector to enrich the cells with successful tagging. Use longer homology arms to increase tagging efficiency. Test sgRNA cutting efficiency in vitro and select the effective sgRNA for nucleofection. Reduce the size of the donor vectors to increase the transfection efficiency of donor vectors. Test the transfection efficiency using a control plasmid, for example the pmaxGFP vector in LONZA human stem cell nucleofector kit.

Problem 3
The survival rate after single cell sorting in 96-well plates is low (section ''generating tagged clonal cell lines with single cell sorting'', step 41).

Potential solution
Sort single cells into multiple 96-well plates to get more surviving clones. Add a viability dye to the cell suspension for sorting, for example, SYTOX Blue or DAPI, and exclude the dead cells based on the dye signal when setting up the gate for single cell sorting. Use media specifically designed to improve the survival of human iPSCs in single-cell workflows, for example, CloneR or CloneR 2.

Problem 4
No band or only non-specific band is amplified (section ''validating tagged cell lines by genotyping PCR'', step 50).

Potential solution
Optimize the PCR program or use another DNA polymerase. Purify or clean the extracted genomic DNA to provide the high quality DNA template for PCR. Design and order more oligos for PCR.

Problem 5
The Sanger sequencing reaction fails or shows mixed peaks (section ''validating tagged cell lines by genotyping PCR'', step 51).

Potential solution
Amplify more PCR product and clean up the PCR product using gel extraction. Extract the genomic DNA using another genomic DNA extraction kit. Make sure the cell population is clonal.

RESOURCE AVAILABILITY
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Yin Shen (yin.shen@ucsf.edu).

Materials availability
The EGFP and mCherry donor backbones used in this study are available on request.

Data and code availability
This study did not generate or analyze any datasets or code.